Skip to content

feat(records): add list (filter) endpoint with read data classes#2680

Closed
andersfylling wants to merge 9 commits into
masterfrom
feat/records-list
Closed

feat(records): add list (filter) endpoint with read data classes#2680
andersfylling wants to merge 9 commits into
masterfrom
feat/records-list

Conversation

@andersfylling

@andersfylling andersfylling commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

Adds list() (filter) to the Records API.

POST /streams/{id}/records/filterRecordList, with support for filtering by time range, DM filter expressions, source selection, sorting, and optional type information.

Added stream type as future rate limits will depend on it. For now I've implemented the lowest rate limit, which is related to immutable streams.

https://cognitedata.atlassian.net/browse/HVD-1262

@codecov

codecov Bot commented Jun 11, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 93.67%. Comparing base (7934665) to head (0a56f21).
⚠️ Report is 5 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2680      +/-   ##
==========================================
+ Coverage   93.65%   93.67%   +0.01%     
==========================================
  Files         498      498              
  Lines       50391    50542     +151     
==========================================
+ Hits        47196    47347     +151     
  Misses       3195     3195              
Files with missing lines Coverage Δ
cognite/client/_api/data_modeling/records.py 100.00% <100.00%> (ø)
cognite/client/_sync_api/data_modeling/records.py 100.00% <100.00%> (ø)
...nite/client/data_classes/data_modeling/__init__.py 100.00% <ø> (ø)
...gnite/client/data_classes/data_modeling/records.py 98.98% <100.00%> (+1.26%) ⬆️
cognite/client/utils/_concurrency.py 92.85% <100.00%> (+0.05%) ⬆️
...s_unit/test_api/test_data_modeling/test_records.py 100.00% <100.00%> (ø)
tests/tests_unit/test_docstring_examples.py 100.00% <100.00%> (ø)

... and 5 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@andersfylling andersfylling force-pushed the feat/records-list branch 3 times, most recently from 712d288 to eeb284f Compare June 15, 2026 08:13
Add RecordsAPI.list, a cursorless POST to /streams/{streamId}/records/filter
returning a RecordList (max 1000 records). Introduces the records read model:

- Record / RecordList read data classes (clean CogniteResource, no RecordId
  multiple-inheritance; RecordList carries optional `typing`).
- TimeRange (gte/gt/lte/lt) for last_updated_time and RecordSourceSelector for
  source/property selection; reuse the data-modeling Filter DSL, InstanceSort,
  and TypeInformation.
- Add a READ op to RecordsConcurrencyOperation + a read semaphore to the
  records concurrency config (reads no longer borrow the write semaphore).
- Make _records_url append the path suffix literally (encode only stream_id)
  so "/filter" isn't percent-encoded.

Also register the records API module in the docstring-example doctest runner.
target_units is intentionally deferred to a follow-up.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
andersfylling and others added 5 commits June 15, 2026 12:11
…l pyproject.toml deps

Switch list() from direct _post() to the _list() helper (matching the
instances API pattern), and remove accidentally added Poetry 1.x
dependency sections (including ty as a runtime dep).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…on all endpoints

- Pass resource_path to _list() so it doesn't fall back to the missing
  _RESOURCE_PATH class var on RecordsAPI, fixing test failures
- Add module-level _DEFAULT_STREAM_TYPE = "immutable" constant
- _get_semaphore now requires stream_type explicitly (no default)
- All endpoints (delete, ingest, upsert, list) accept stream_type and
  pass an explicit semaphore, ready for future HierarchicalBoundedSemaphore
- Regenerate sync API

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…limit)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@andersfylling andersfylling marked this pull request as ready for review June 15, 2026 14:24
@andersfylling andersfylling requested review from a team as code owners June 15, 2026 14:24

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request adds support for filtering and listing records in a stream by introducing the list method to both the async and sync RecordsAPI, along with supporting data classes (Record, RecordList, TimeRange, RecordSourceSelector) and unit tests. Feedback on the changes suggests enhancing type safety by using Literal["immutable", "mutable"] instead of str for the stream_type parameter. Additionally, it is recommended to remove the unused stream_type parameter from _get_semaphore to prevent Liskov Substitution Principle violations and avoid unnecessary type-ignore comments.

Comment thread cognite/client/_api/data_modeling/records.py
Comment thread cognite/client/_api/data_modeling/records.py Outdated
Comment thread cognite/client/_api/data_modeling/records.py
Comment thread cognite/client/_api/data_modeling/records.py Outdated
Comment thread cognite/client/_api/data_modeling/records.py
Comment thread cognite/client/_api/data_modeling/records.py Outdated
Comment thread cognite/client/_api/data_modeling/records.py
Comment thread cognite/client/_api/data_modeling/records.py Outdated
Comment thread cognite/client/_api/data_modeling/records.py
andersfylling and others added 2 commits June 15, 2026 16:27
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

@haakonvt haakonvt left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed the API implementation, will try to cover data classes + tests tomorrow

Comment on lines 38 to 42
_OPERATION_TO_RATE_LIMIT: ClassVar[dict[str, RecordsConcurrencyOperation]] = {
"read": RecordsConcurrencyOperation.READ,
"write": RecordsConcurrencyOperation.WRITE,
"delete": RecordsConcurrencyOperation.WRITE,
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think this is an unnecessary indirection 😄

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm cleaning it up here: #2688

the concurrency setup is a bit complicated, so wrote something dumbed down in this PR.

items: RecordId | Sequence[RecordId],
*,
stream_id: str,
stream_type: StreamType = _DEFAULT_STREAM_TYPE,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand the usage here at all - it seems to only be used by _get_semaphore? 🤔

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it will be related to an upcoming concurrency config: #2688

Right now it's a noop. but all operations are stream type bound.

identifiers=RecordIdSequence.load(items),
wrap_ids=True,
resource_path=self._records_url(stream_id),
override_semaphore=self._get_semaphore("delete", stream_type),

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you can't infer stream_type, you need to collapse this to a single semaphore setting that works in all cases

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think I quite understand, could you elaborate?

Comment on lines +228 to +229
(max 1000). To page over a large time window, issue multiple calls with partitioned
``last_updated_time`` ranges.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to support pagination, it is one of the key features of the SDK 😄

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pagination is not a concept for this api endpoint, sadly.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The SDK should still do it to avoid 100 customers doing it in 100 wrong ways xD

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you saying we should postponed adding streams and records to the SDK until we can add paging in the backend?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No 😅 I have not used records - but I just assumed you can/need to paginate by passing ie last timestamp+1 and so on

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I read your docstring now lol. I want the SDK to handle this automatically:

To page over a large time window, issue multiple calls with partitioned…

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, my view is we can do it. But I don't think we should support niche pagination. So unless the API supports it, I don't want to implement this for every SDK.

filter: Filter | None = None,
sources: Sequence[RecordSourceSelector] | None = None,
sort: Sequence[InstanceSort] | InstanceSort | None = None,
limit: int = 10,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have a default constant that should be used

filter (Filter | None): Filter expression (see :mod:`cognite.client.data_classes.filters`).
sources (Sequence[RecordSourceSelector] | None): Which container properties to return.
sort (Sequence[InstanceSort] | InstanceSort | None): Sort specification(s); up to 5.
limit (int): Maximum number of records to return (1-1000). Defaults to 10.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
limit (int): Maximum number of records to return (1-1000). Defaults to 10.
limit (int): Maximum number of records to return (1-1000).

self,
stream_id: str,
*,
stream_type: StreamType = _DEFAULT_STREAM_TYPE,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same argument as above, this will confuse users. If they specify the wrong value here, the list method will still work.

Suggested change
stream_type: StreamType = _DEFAULT_STREAM_TYPE,

@andersfylling

Copy link
Copy Markdown
Contributor Author

I'll remove the stream_type parameter for now, and make it part of the follow up PR. please ignore these for now

…ment

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@haakonvt haakonvt deleted the feat/records-list branch June 22, 2026 16:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants